A Bilingual Corpus of Inter-linked Events

Authors

  • Tommaso Caselli
  • Nancy Ide
  • Roberto Bartolini
Abstract

This paper describes the creation of a bilingual corpus of inter-linked events for Italian and English. Linkage is accomplished through the Inter-Lingual Index (ILI) that links ItalWordNet with WordNet. The availability of this resource, on the one hand, enables contrastive analysis of the linguistic phenomena surrounding events in both languages, and on the other hand, can be used to perform multilingual temporal analysis of texts. In addition to describing the methodology for construction of the inter-linked corpus and the analysis of the data collected, we demonstrate that the ILI could potentially be used to bootstrap the creation of comparable corpora by exporting layers of annotation for words that have the same sense.
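The linking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sense identifiers, the ILI record names, the `Token` class, the function names, and the `OCCURRENCE` label are all hypothetical stand-ins, assuming only that each language's wordnet maps senses to shared ILI records.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy mappings: these identifiers are invented for illustration
# and are not real ItalWordNet, WordNet, or ILI records.
ITALIAN_TO_ILI = {"iwn:partire#1": "ili:0001", "iwn:arrivo#1": "ili:0002"}
ENGLISH_TO_ILI = {"wn:leave#1": "ili:0001", "wn:run#1": "ili:0003"}


@dataclass(frozen=True)
class Token:
    form: str                          # surface form in the text
    sense: str                         # sense id in the language's own wordnet
    event_class: Optional[str] = None  # annotation layer (hypothetical label)


def link_events(italian, english):
    """Pair Italian and English tokens whose senses share an ILI record."""
    english_by_ili = {}
    for tok in english:
        ili = ENGLISH_TO_ILI.get(tok.sense)
        if ili is not None:
            english_by_ili.setdefault(ili, []).append(tok)

    links = []
    for it_tok in italian:
        ili = ITALIAN_TO_ILI.get(it_tok.sense)
        for en_tok in english_by_ili.get(ili, []):
            links.append((it_tok, en_tok))
    return links


def export_annotations(links):
    """Copy the Italian event annotation onto each linked English token."""
    return [(en.form, it.event_class)
            for it, en in links if it.event_class is not None]


italian = [Token("partire", "iwn:partire#1", "OCCURRENCE"),
           Token("arrivo", "iwn:arrivo#1", "OCCURRENCE")]
english = [Token("leave", "wn:leave#1"), Token("run", "wn:run#1")]

links = link_events(italian, english)
exported = export_annotations(links)
```

Here only "partire" and "leave" share an ILI record, so only that pair is linked and only "leave" receives the exported annotation. Tokens without a shared record are left untouched.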

Similar resources

Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features

Sentence-level alignment of bilingual parallel corpora plays a significant and indispensable role in machine translation, translation knowledge acquisition, and bilingual lexicography research, and is fundamental work for natural language processing. Given the great deal of work in sentence alignment and the variety of methods that have been developed for bilingual terminology extraction, those are...

Evaluating Compound-to-compound Links in a Sub-sentence Aligned Bilingual Corpus through Example-based Element Recognition

This paper will present an algorithm that evaluates links between one-word compounds and two-word compounds in a bilingual corpus that has been aligned at the sub-sentence level. The phenomenon of linking one-word compounds to multi-word compounds is common when English is being linked to other Germanic languages, and it is difficult to get the links right in the alignment process. The algorith...

X-Linked Lissencephaly with Absent Corpus Callosum and Ambiguous Genitalia: A Case Report

Background: X-linked lissencephaly with ambiguous genitalia (XLAG) is a recently described genetic disorder, in which patients present with lissencephaly, agenesis of the corpus callosum, refractory epilepsy of neonatal onset, acquired microcephaly, and male genotype with ambiguous genitalia. XLAG is responsible for a severe neurological disorder of neonatal onset in boys. A gyration defect con...

Joint search in a bilingual valency lexicon and an annotated corpus

... so I say to you ... search, and you will find ... In this paper and the associated system demo, we present an advanced search system that allows to perform a joint search over a (bilingual) valency lexicon and a correspondingly annotated linked parallel corpus. This search tool has been developed on the basis of the Prague Czech-English Dependency Treebank, but its ideas are applicable in p...

Bilingual Dictionary Extraction from Wikipedia

The way of mining comparable corpora and the strategy of dictionary extraction are two essential elements of bilingual dictionary extraction from comparable corpora. This paper first proposes a method, which uses the interlanguage link in Wikipedia, to build comparable corpora. The large scale of Wikipedia ensures the quantity of collected comparable corpora. Besides, because the inter-language...

Publication date: 2008